Efficient SIMD Vectorization for Hashing in OpenCL

نویسندگان

  • Tobias Behrens
  • Viktor Rosenfeld
  • Jonas Traub
  • Sebastian Breß
  • Volker Markl
چکیده

Hashing is at the core ofmany efficient database operators such as hash-based joins and aggregations. Vectorization is a technique that uses Single Instruction Multiple Data (SIMD) instructions to process multiple data elements at once. Applying vectorization to hash tables results in promising speedups for build and probe operations. However, vectorization typically requires intrinsics – low-level APIs in which functions map to processorspecific SIMD instructions. Intrinsics are specific to a processor architecture and result in complex and difficult to maintain code. OpenCL is a parallel programming framework which provides a higher abstraction level than intrinsics and is portable to different processors. Thus, OpenCL avoids processor dependencies, which results in improved code maintainability. In this paper, we add efficient, vectorized hashing primitives to OpenCL. Our results show that OpenCL-based vectorization is competitive to intrinsics on CPUs but not on Xeon Phi coprocessors.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-tier Dynamic Vectorization for Translating GPU Optimizations into CPU Performance

Developing high performance GPU code is labor intensive. Ideally, developers could recoup high GPU development costs by generating high-performance programs for CPUs and other architectures from the same source code. However, current OpenCL compilers for non-GPUs do not fully exploit optimizations in well-tuned GPU codes. To address this problem, we develop an OpenCL implementation that efficie...

متن کامل

CLVectorizer: A Source-to-Source Vectorizer for OpenCL Kernels

While many-core processors offer multiple layers of hardware parallelism to boost performance, applications are lagging behind in exploiting them effectively. A typical example is vector parallelism(SIMD), offered by many processors, but used by too few applications. In this paper we discuss two different strategies to enable the vectorization of naive OpenCL kernels. Further, we show how these...

متن کامل

Regular and almost universal hashing: an efficient implementation

Random hashing can provide guarantees regarding the performance of data structures such as hash tables— even in an adversarial setting. Many existing families of hash functions are universal: given two data objects, the probability that they have the same hash value is low given that we pick hash functions at random. However, universality fails to ensure that all hash functions are well behaved...

متن کامل

Boosting the Performance of Multimedia Applications Using SIMD Instructions

Modern processors’ multimedia extensions (MME) provide SIMD ISAs to boost the performance of typical operations in multimedia applications. However, automatic vectorization support for them is not very mature. The key difficulty is how to vectorize those SIMD-ISA-supported idioms in source code in an efficient and general way. In this paper, we introduce a powerful and extendable recognition en...

متن کامل

Vectorization of the 2D Wavelet Lifting Transform Using SIMD Extensions

This paper addresses the vectorization of the lifting-based wavelet transform on general-purpose microprocessors in the context of JPEG2000. Since SIMD exploitation strongly depends on an efficient memory hierarchy usage, this research is based on previous work about cacheconscious DWT implementations [1,2,3]. The experimental platform on which we have chosen to study the benefits of the SIMD e...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018